IBIS Macromodel Task Group

Meeting date: 31 March 2020

Members (asterisk for those attending):
Achronix Semiconductor        * Hansel Dsilva
ANSYS:                        * Curtis Clark
                              * Wei-hsing Huang
Cadence Design Systems:       * Ambrish Varma
                                Ken Willis
                              * Jared James
Intel:                        * Michael Mirmak
Keysight Technologies:        * Fangyi Rao
                                Radek Biernacki
                                Ming Yan
                              * Todd Bermensolo
Marvell                         Steve Parker
Mentor, A Siemens Business:   * Arpad Muranyi
Micron Technology:            * Randy Wolff
                              * Justin Butterfield
SiSoft (Mathworks):           * Walter Katz
                                Mike LaBonte
Teraspeed Labs:               * Bob Ross

The meeting was led by Arpad Muranyi.  Curtis Clark took the minutes.

--------------------------------------------------------------------------------
Opens:

- New Attendee: Jared James of Cadence introduced himself.  He said he had
  worked with AMI models for many years.  He had worked in Cadence's IP group,
  where he developed AMI models.  He now works with Ambrish and Ken on the tool
  development side.

-------------
Review of ARs:

- Hansel to send an email to ATM about his step vs. pulse response question.
  - Done.  The email was sent to ATM shortly before this meeting.

--------------------------
Call for patent disclosure:

- None.

-------------------------
Review of Meeting Minutes:

Arpad asked for any comments or corrections to the minutes of the March 24
meeting.  Walter moved to approve the minutes.  Randy seconded the motion.
There were no objections.

-------------
New Discussion:

Discussion on "Gap in IBIS for sampling with statistical mode AMI models":
Hansel reported that he is progressing on an initial BIRD draft.  He is waiting
for feedback from collaborators and hopes to have a presentation for ATM in two
weeks.

DDR Clock Forwarding:
Arpad recapped the discussion from last week and noted that a major topic was
whether a new function signature (AMI_GetWave2()) was required or use of the
clock_times array as an input was sufficient.

Fangyi shared a "Clock Forwarding BIRD Discussion" presentation he had created
to advance the discussion.
slide 2: Effects that Can Be Modeled by GetWave2
- With the DQS waveform available, one can model:
  - clock forwarding and DQ-DQS correlated jitter tracking
  - DQ slicer sensitivity to DQS slew rate
  - physical model of DQ Rx phase interpolator (PI)
  - nonlinearity and discretization in PI output delay
  - DQS jitter amplification by the PI
  - DQS correlated voltage noise on PI, slicer and DFE

slide 3: PI Output Delay Nonlinearity and Discretization
Fangyi noted that the "Ideal Output" figure shows the output waveforms for each
value of n from 0 to N.  A newly introduced figure demonstrates the non-uniform
spacing of the crossing delays as n varies from 0 to N in the Ideal Input
waveform case.  Fangyi said that delay nonlinearity and discretization can only
be captured by a physical model of the PI, and this is only possible with the
full DQS waveform.

Michael asked what defines the value of N.  Fangyi said it is a design parameter
that is fixed.  Walter explained that a typical value of N is 32, and it is
equivalent to a rotator with 32 taps.  If the UI were 160ps, for example, then
nominally each tap is separated by 5ps.  One design for a PI might be a phase
lock loop (PLL) running at a high frequency with 32 taps, and this would be
quite linear.  The PI we are discussing here is not a PLL, however, and this
method of generating phase interpolation and moving the zero crossing has
nonlinearities.

Fangyi said that if you apply a low pass filter (LPF) after the PI output, then
the output waveform will be smoother.  This is shown in the "Output after LPF"
figure, and it improves the linearity of the delay as a function of n.

slide 4: DQS Jitter Amplification by the PI
This slide demonstrates an example of jitter amplification.  Given a case with
tau1 = 0, tau2 = .5*UI (90 degrees), N=32, and no LPF, a 10% DCD on the input
DQS waveform results in an Output DQS DCD that is greater than or equal to 10%
and varies with n.
The max value of the Output DCD is approximately 40% larger than the input DCD
and occurs at n=16.  The use of an LPF would increase the amplification.  The
effect can't be modeled without the DQS waveform.

slide 5: DQS Correlated Voltage Noise
Fangyi noted that DQ and DQS voltage noise can be correlated.  If this effect is
not considered, the eye width can be underestimated by as much as 10%.  The full
DQS waveform is needed to model voltage noise effects on the PI, slicer and
hence the DFE.

slide 6: GetWave2 vs GetWave
This slide presents a table of the six important effects listed in slide 2.  The
table states that GetWave2 can be used to model all of them, while GetWave with
clock_times as an input can only handle one of them.  GetWave with an internal
CDR in the model cannot capture any of the effects.

                                                GW with     GW with
Effect to Model                             GW2 clock_times internal CDR
------------------------------------------- --- ----------- ------------
Clock forwarding and DQ-DQS Jitter Tracking Yes Yes         No
DQ slicer sensitivity to slew rate          Yes No          No
Physical model of PI                        Yes No          No
PI output nonlinearity and discretization   Yes No          No
DQS correlated voltage noise                Yes No          No
DQS jitter amplification by the PI          Yes No          No

Fangyi said that these factors can critically affect system performance, and
that several IC vendors specifically requested that these effects be modeled.
Fangyi noted that GetWave is still supported, and GetWave and GetWave2 can
coexist.

slide 7: Simulation Flow Complexity
Fangyi said that some had objected to possible complication of the flow.  This
slide shows a system block diagram and defines 3 steps:
1. Compute analog channel output according to current flow (with crosstalk)
2. Compute the output of all DQS Rx DLLs.
3. Compute the output of all DQ Rx DLLs.

Step three takes the output of step 2 as an input, but Fangyi noted that all
three steps exist in the current flow.  The only new detail is that step 2 must
be done prior to step 3.
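
The ordering constraint in the three steps on slide 7 can be illustrated with a
minimal Python sketch.  Every function name and signature below is hypothetical
and chosen only for illustration; none of them come from the IBIS specification
or from the BIRD draft, and the arithmetic inside each stand-in is a placeholder
rather than a physical model.  The only point shown is that the DQS Rx output
must exist before any DQ Rx model is evaluated.

```python
# Hypothetical sketch of the step ordering only; not an AMI API.

def channel_output(stimulus):
    """Step 1 stand-in: existing analog channel computation (placeholder gain)."""
    return [0.5 * v for v in stimulus]

def dqs_rx_model(dqs_wave):
    """Step 2 stand-in: DQS Rx model; here it passes its input through."""
    return list(dqs_wave)

def dq_rx_model(dq_wave, dqs_wave):
    """Step 3 stand-in: DQ Rx model that is handed the full DQS waveform.

    The gating rule below is an arbitrary placeholder, not a slicer model.
    """
    return [d if s > 0 else 0.0 for d, s in zip(dq_wave, dqs_wave)]

def run_flow(dq_stimuli, dqs_stimulus):
    """Run the three steps in order and record the order for inspection."""
    order = []

    # Step 1: analog channel output for every DQ and for DQS.
    dq_analog = {name: channel_output(w) for name, w in dq_stimuli.items()}
    dqs_analog = channel_output(dqs_stimulus)
    order.append("channel")

    # Step 2: DQS Rx output, needed as an input by step 3.
    dqs_out = dqs_rx_model(dqs_analog)
    order.append("dqs_rx")

    # Step 3: each DQ Rx model consumes its own waveform plus the DQS output.
    dq_out = {name: dq_rx_model(w, dqs_out) for name, w in dq_analog.items()}
    order.append("dq_rx")

    return dq_out, order
```

For example, run_flow({"DQ0": [1.0, -1.0, 1.0, 1.0]}, [1.0, 1.0, -1.0, 1.0])
returns the processed DQ0 samples together with the recorded step order.
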
Fangyi said he thought this flow was actually simpler than the flow that uses
GetWave with clock_times as an input, because that flow requires the EDA tool to
extract clock ticks from the DQS waveforms.

slide 8: Summary
Fangyi said that GetWave2 will address critical DDR5 modeling requirements that
GetWave cannot.  He said that no technical objections to GetWave2's capabilities
had been raised.  He noted that ATM members had collaborated successfully on
DDR5 issues recently with the DC_Offset BIRD (BIRD197.7), and he asked that we
do it again.

Arpad recalled that one of the primary questions from the previous meeting was:
What are the magnitudes of these effects that GetWave2 can model?  He asked if
Fangyi could quantify the magnitude of these effects, as opposed to "yes" or
"no" entries in the table on slide 6.  He asked if there was a way to better
quantify what we would lose by not having GetWave2.  Could we quantify the
impact on predicted eye height, or width, or predicted BER, etc.?

Fangyi said that system designers typically look at their timing margin with
respect to a mask specified at a given BER, for example 1e-16.  He said that if
you look at that margin without considering jitter amplification by the PI, you
could overestimate the margin by up to 40% of the DCD on DQS.  Similarly, if you
looked at the margin without considering the correlation of DQ and DQS voltage
noise, you could underestimate the timing margin by 10% of the eye width.

Michael noted that he expected to have more feedback from an internal review at
the next meeting.  He asked Fangyi if all of the effects Fangyi had enumerated
meant that a statistical modeling flow would be impossible because it wouldn't
be accurate enough.  He said that one common approach in model development was
to perform low-level time domain simulations and then quantify and capture
effects in parameters that could be used in a statistical simulation.  Fangyi
said that nothing he was describing precludes that.
He said that GetWave2 would allow the time domain flow simulations to more
accurately model the effects and capture them as parameter values for a
statistical flow.

Arpad noted that we had delayed a scheduled straw poll on whether to submit this
BIRD to the Open Forum.  Walter requested that we delay the poll for another
week.  Walter said his team had also researched these various effects and he
would have a presentation on an alternate proposal at next week's meeting.
Michael said he would have more information next week.  Ambrish said they would
take this up with their IP group as well and see if they thought there is a
problem here that needs to be addressed.

Randy asked Fangyi if the fact that this flow removes the internal CDR from
GetWave means that we have to further discuss the simulation flow for training
strobe timing with data.  Fangyi said this is a modeling issue, not a simulation
flow issue.  Randy asked if we need some setup in simulation to get the strobe
input to GetWave2 with the proper timing.  Fangyi said if the Rx were for the
controller, then the model contains the PI and can take care of the alignment.
If the Rx were on the DRAM side, you can do a write-leveling training and set
the skew.

To elaborate on Randy's question, Walter noted that this discussion had been
about DDR reads, where the memory is writing with a well-defined DQ-DQS skew and
the controller (the Rx model in this case) has a PI to get the clock adjusted to
the correct location.  In the case of writes, the DRAM is the Rx and there is no
PI.  There are different delays for the DQ to the latch and the DQS to the
latch.  Memory knows what those delays are, but for a wide bus simulation the
controller has to add the correct skew for each DQ.  What flow would we use to
get the correct controller skew?  That could be some iterative process, and it's
not a back-channel process because when you change the skew the crosstalk moves
around so you have to redo the simulation.
That flow can be complicated, and that's the one Randy was asking about.  Randy
agreed.

Walter said the DDR model could choose to implement GetWave2, or it could use
the existing GetWave and do the clock generation internally.  You can continue
to use the internal CDR GetWave, but for a wide bus full handling of bit-by-bit
DQ-DQS interaction we need a new flow.  IBIS could leave it to the EDA tools to
decide.  An EDA tool could choose to run 32 simulations to sweep the skew, for
example.  That's a simple, but time-consuming flow.

Fangyi agreed that, if solving this from the EDA tool side, the tool could sweep
different values of skew.  Alternatively, the EDA tool could do a two-step
approach.  The first step would be a simulation for the write leveling training,
and the tool could monitor the DQ and DQS waveforms at the DQ input and estimate
the skew.  Then, as step 2, it could use that skew estimate and run another
simulation.  Fangyi said you could also address this on the modeling side.  The
DRAM Rx model could use GetWave2 and do skew adjustment internally.  This would
be building some of the controller's functionality into the DRAM model, but
that's an acceptable approach.  Sometimes similar overlap is done in SerDes
models.  Randy said that we need some way to coordinate between what the models
will do and what the EDA tools will do.

Todd noted that in GetWave models with an internal CDR, the CDR model tends to
provide some inherent low frequency jitter rejection, and only some high
frequency jitter gets through.  He said this behavior will be incorrect for
clock-forwarding applications, unless there is very low skew between DQ and DQS.
So, the inherent behavior of an internal CDR model will result in optimistic
estimates in a clock-forwarding application.

Jared noted SerDes models may contain a PI.  It's typically an ideal model that
might not cover some of the nonlinearities described in slide 2.
He asked if the purpose of the new GetWave2 was to model the behavior of the DQS
path.  Fangyi said providing a physical model of the PI was one of the benefits
of GetWave2.

Arpad asked Fangyi to send the presentation to the ATM list.  Fangyi agreed.

BIRD201 Back-channel Statistical Optimization:
Walter noted that this had been submitted to the Open Forum and had been open
for discussion for several months.  He had not yet received any feedback, and at
the next Open Forum meeting he planned to move to schedule a vote.

- Curtis: Motion to adjourn.
- Randy: Second.
- Arpad: Thank you all for joining.

AR: Fangyi to send his "Clock Forwarding BIRD Discussion" presentation to the
    ATM list.

-------------
Next meeting: 07 April 2020 12:00pm PT

-------------
IBIS Interconnect SPICE Wish List:

1) Simulator directives